


VOLUME I GENERAL PRINCIPLES 



MOLECULAR 
BIOLOGY 
OF THE 

GENE FOURTH EDITION 



James D. Watson 
Nancy H. Hopkins 
Jeffrey W. Roberts 
Joan Argetsinger Steitz 
Alan M. Weiner 



COLD SPRING HARBOR LABORATORY 

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 

CORNELL UNIVERSITY 

YALE UNIVERSITY

YALE UNIVERSITY 



The Benjamin/Cummings Publishing Company, Inc.



Menlo Park, California • Reading, Massachusetts • Don Mills, Ontario
Wokingham, U.K. • Amsterdam • Sydney • Singapore
Tokyo • Madrid • Bogota • Santiago • San Juan



Cover art: a computer-generated image of
DNA interacting with the Cro repressor
protein of bacteriophage λ. The image was
prepared by the Graphic Systems Research
Group at the IBM U.K. Scientific Centre.



Editor: Jane Reece Gillen
Production Supervisor: Karen K. Gulliver
Editorial Production Supervisor: Betsy Dilernia
Cover and Interior Designer: Gary A. Head
Art Coordinator: Pat Waldo
Art Director and Principal Artist: Georg Matt

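Copyright © 1987 by The Benjamin/Cummings Publishing Company, Inc.
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior written permission of the publisher. Printed in the
United States of America. Published simultaneously in Canada.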



Library of Congress Cataloging-in-Publication Data
Molecular biology of the gene.

Bibliography 
Includes index. 



ABCDEFGHIJ-MU-89876



Menlo Park, California 94025 



[Contents page fragment. Legible entries for Chapter 8 include "A
Second Amino Acid Replacement May Cancel Out the Effect of the First",
"Transcriptional Units", "Summary", and "Bibliography". The fragment
continues with Part IV, "DNA in Detail", and Chapter 9, "DNA Is Usually
a Double Helix". The remaining entries and page numbers are illegible.]





[Figure 8-11 diagram: a genetic map with map distances between
adjacent mutations (0.04, for example), aligned with the amino acid
sequence of the protein segment. Labeled rows: genetic map; amino acid
sequence; amino acid found in wild-type enzyme; amino acid found in
mutant enzyme. Marked sequence positions include 175, 177, 183, 211,
213, 234, and 235.]

amino acids. This sequence allows us to see how the location of a 
mutation within a gene is correlated with the location of the replaced 
amino acid in its polypeptide chain product. Since both genes and 
polypeptide chains are linear, the simplest hypothesis is that amino 
acid replacements are in the same relative order as the mutationally 
altered sites in the corresponding mutant genes. This was most
pleasingly demonstrated in 1964. The location of each specific amino
acid replacement is exactly correlated with its location along the genetic
map, a property called colinearity. Thus, successive amino acids in a 
polypeptide chain are controlled, or coded, by successive regions of a 
gene. 



Mutable Sites Are the Base 
Pairs Along the Double Helix 

In all bacterial genes extensively mapped, the large number of lin- 
early arranged mutable sites that have been found in each gene, and 
between which genetic recombination (crossing over) is possible, 
leaves us no choice but to conclude that these sites are the specific 
base pairs along the DNA of the respective gene (Figure 8-12). A 
given mutable site can thus exist in any of four different states, AT, 
TA, GC, or CG. Many mutations are therefore likely to represent 
simple switches from one state to another. The genetic data that re- 
veal deletions and insertions of genetic material must now be thought 
of in terms of the addition or deletion of discrete blocks of one to very 
many base pairs. The three classes of mutations resulting from 
changes in the sequence of nucleotide bases are illustrated in Figure 
8-13. 
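
The three classes can be sketched in a few lines of Python. This is only an illustration: it uses one strand of the wild-type segment drawn in Figure 8-13, and the function names and the positions chosen for each change are assumptions made for the example, not part of the original figure.

```python
# One strand of the wild-type segment drawn in Figure 8-13, read left to right.
WILD_TYPE = "ATTGCATCGACCTAGCT"

# Watson-Crick partners: each mutable site is one of AT, TA, GC, or CG.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Return the paired bases in the same left-to-right order as `strand`."""
    return "".join(PARTNER[base] for base in strand)


def substitute(strand, index, base):
    """Class 1: switch one base pair to another state."""
    return strand[:index] + base + strand[index + 1:]


def insert(strand, index, bases):
    """Classes 2 and 3: insert one base pair or a block of them."""
    return strand[:index] + bases + strand[index:]


def delete(strand, index, length=1):
    """Classes 2 and 3: delete one base pair or a block of them."""
    return strand[:index] + strand[index + length:]


changed = substitute(WILD_TYPE, 4, "T")   # a CG pair becomes TA
inserted = insert(WILD_TYPE, 9, "T")      # one extra base pair
deleted = delete(WILD_TYPE, 6, 6)         # a block of six base pairs removed

for label, strand in [("wild type", WILD_TYPE), ("changed", changed),
                      ("inserted", inserted), ("deleted", deleted)]:
    print(f"{label:10s} {strand}\n{'':10s} {partner_strand(strand)}")
```

Only the first class leaves the length of the segment unchanged, which is why it alone can be described as a switch between the four states of a single mutable site.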

By carefully studying the fine details of genetic maps, we should be
able to obtain important information about the corresponding DNA.
However, not every change in base sequence leads to easily observed
changes in the corresponding protein. In the genetic code, many
amino acids are specified by more than one codon (set of three
adjacent bases), which means that in many cases, base-pair
substitutions will not lead to any amino acid replacements. Moreover,
as we document later, many of the amino acids in proteins are not
essential, and when they are replaced by somewhat similar amino acids,
the proteins often retain full activity. The number of observed mutable
sites therefore seriously underrepresents the number of base pairs
within the corresponding gene.



Figure 8-11

Colinearity of the gene and its protein product. Here is the genetic
map for one-fourth of the gene coding for the amino acid sequences in
the E. coli protein tryptophan synthetase A. The designation 0.04, for
example, refers to map distances (frequencies of recombination)
between tryptophan synthetase mutations A446 and A487. The numbers in
the amino acid sequence refer to their position in the 267 residues of
the A protein. Following convention, the amino terminal end of the
segment is on the left.



224 The Fine Structure of Bacterial and Phage Genes



Figure 8-12

The relationship of mutations in the rII region of the phage T4
chromosome to the structure of DNA.

T4 chromosome contains 2 × 10^5 nucleotide pairs.

The rII region represents ~2% of the total genetic map
(4 × 10^3 nucleotide pairs).

The rII region comprises two separate genes, a mutation in either of
which produces the r character:
rIIA gene (~2500 nucleotide pairs)
rIIB gene (~1500 nucleotide pairs)

Magnified view of a short section of the rIIA gene. Those mutations
that map close to each other probably represent changes in adjacent
nucleotide pairs.

Small segment of rIIA gene (~100 nucleotide pairs)
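
To make the point about redundant codons concrete, the following Python sketch counts how many single-base changes to amino-acid-specifying codons leave the encoded amino acid unchanged. It assumes the standard genetic code, written here as the usual 64-letter string in U, C, A, G order; the variable and function names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, first base varying slowest; "*" marks chain termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_changes(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for i, old in enumerate(codon):
        for new in BASES:
            if new != old:
                yield codon[:i] + new + codon[i + 1:]


sense_codons = [codon for codon, aa in CODE.items() if aa != "*"]

total = 0
silent = 0
for codon in sense_codons:
    for mutant in single_base_changes(codon):
        total += 1
        if CODE[mutant] == CODE[codon]:
            silent += 1   # base pair changed, amino acid unchanged

print(f"{silent} of {total} single-base changes are silent")
```

Silent changes of this kind produce no altered protein and so never appear as mutable sites on a map built from protein phenotypes.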



There Are Four Alternative
Structures for Each Mutable Site 8-9

As anticipated, enzymatically inactive tryptophan synthetase
molecules resulting from independent mutations at the same mutable site
(as shown by failure to give wild-type recombinants) do not always
contain the same amino acid replacement. For example, changes in a
single mutable site that specifies the amino acid at position 213 result
in the replacement of glycine by either glutamic acid or valine.
Inspection of the genetic code (see Chapter 15) indicates that in the
wild-type strain, this glycine must be specified by either GGA or GGG
codons and that the mutable site under study specifies the G in the
middle position of this codon. When this G is replaced by U, valine
(GUA or GUG) becomes inserted into the glycine site, while its
replacement by A generates the glutamic acid (GAA or GAG)
substitution. Further study of this particular mutable site might eventually
turn up the anticipated third replacement in which a G to C switch
leads to the appearance of alanine (GCA or GCG).
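
The three alternatives at this site can be checked mechanically. The sketch below, which assumes the standard genetic code and uses an illustrative helper named `middle_changes`, replaces the middle G of each candidate glycine codon with the other three bases and looks up the resulting amino acid.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, first base varying slowest; "*" marks chain termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Three-letter names for the one-letter symbols that occur in this example.
NAMES = {"G": "Gly", "V": "Val", "E": "Glu", "A": "Ala"}


def middle_changes(codon):
    """Map each replacement base at the middle position to the new amino acid."""
    return {base: CODE[codon[0] + base + codon[2]]
            for base in BASES if base != codon[1]}


for codon in ("GGA", "GGG"):
    for base, amino_acid in middle_changes(codon).items():
        mutant = codon[0] + base + codon[2]
        print(f"{codon} ({NAMES[CODE[codon]]}) -> {mutant} ({NAMES[amino_acid]})")
```

Both candidate codons give the same three replacements (valine, alanine, and glutamic acid), so the observed substitutions do not distinguish GGA from GGG.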






Figure 8-13

Three classes of mutations result from introducing defects in the
sequence of bases (A, T, G, C) attached to the backbone of the DNA
molecule. In one class, a base pair is simply changed from one into
another (e.g., GC to AT). In the second class, a base pair is inserted
(or deleted). In the third class, a block of base pairs is deleted (or
inserted).

[Diagram panels, each drawn as a short double-stranded segment:
wild-type gene (ATTGCATCGACCTAGCT paired with TAACGTAGCTGGATCGA); base
pair changed; insertion of a base pair; deletion of a block of six
base pairs.]

Single Amino Acids Are Specified 
by Several Adjacent Nucleotide Bases 

We expected to find that given amino acids within a particular protein 
are specified by adjacent mutable sites. This point was first demon- 
strated in the tryptophan synthetase A gene, where the relevant evi- 
dence came from study of the tryptophan synthetase fragment illus- 
trated in Figure 8-14. Treatment of the wild-type strain with a 
mutagen had given rise to mutant A23, in which arginine replaces 
glycine (this time at position 212), and mutant A46, in which glutamic 
acid replaces glycine at the same position. The difference between 
A23 and A46 does not represent changes to alternative forms of the 
same mutable site, since a genetic cross between A23 and A46 yields a 
number of wild-type recombinants (glycine in position 212). If these
changes were at the same mutable site, no wild-type recombinants 
would be produced. Moreover, the very low observed frequency of 
the wild-type recombinants is compatible with the prediction from 
the genetic code that these mutable sites are adjacent to each other. 
Additional genetic evidence that confirms the separate locations of 
the A23 and A46 mutable sites comes from observing how A23 and 
A46 themselves mutate upon treatment with mutagens. After expo- 
sure to a mutagen, both strains give rise to new strains, some of 
which contain active tryptophan synthetase A chains with glycine in 
position 212. These reverse mutations most likely involve changing 
the altered mutable sites back to the original wild-type configuration. 
However, strains containing active tryptophan synthetase also arise 






Figure 8-14

Demonstration that a single amino acid is specified by more than one
mutable site. We now know that the mutable sites are DNA bases and the
codons are actually bases complementary to these in mRNA. (After
Emanuel J. Murgola.)

[Diagram: each square represents a mutable site.
Wild-type tryptophan synthetase gene: codon GGA, amino acid Gly.
Mutant A23 gene: site of A23 mutation, codon AGA, amino acid Arg.
Mutant A46 gene: site of A46 mutation, codon GAA, amino acid Glu.
Genetic cross between mutants A23 and A46: AGA × GAA yields GGA, giving
0.002% wild-type recombinants with glycine in position 211.]



in which the amino acid in position 212 is replaced by another amino
acid. Most significantly, the type of replacement differs for strains
A23 and A46. Besides back-mutating to glycine, strain A23 mutates to
threonine and serine, whereas A46 mutates to alanine and valine in
addition to glycine. The failure of A23 ever to give rise to alanine or
valine and the failure of A46 ever to mutate to threonine or serine is
very difficult to explain if their differences from wild type are based
on alternative configurations of the same mutable site. But these
mutational patterns make perfect sense if glycine at the 212 position is
coded by GGA, with the A23 mutation to arginine representing a G to
A change at the first position of the codon to give rise to AGA and the
A46 mutation to glutamic acid occurring at the middle (second)
position to give rise to GAA. Their divergent subsequent mutations to
serine and threonine and to alanine and valine, respectively, can also
be understood by inspecting the genetic code (Figure 8-15).
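
The argument can be verified by listing every amino acid reachable from AGA and from GAA by a single base change. The following Python sketch assumes the standard genetic code; the helper name `reachable` is an illustrative choice.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, first base varying slowest; "*" marks chain termination.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reachable(codon):
    """Return the amino acids (one-letter symbols) one base change away."""
    found = set()
    for i in range(3):
        for base in BASES:
            if base != codon[i]:
                found.add(CODE[codon[:i] + base + codon[i + 1:]])
    found.discard(CODE[codon])   # ignore changes that keep the same amino acid
    return found


from_a23 = reachable("AGA")   # A23 codon, arginine
from_a46 = reachable("GAA")   # A46 codon, glutamic acid

print("AGA ->", sorted(from_a23))
print("GAA ->", sorted(from_a46))
```

In the one-letter symbols used here, G is glycine, T threonine, S serine, A alanine, and V valine. Glycine appears in both sets, threonine and serine only in the set for AGA, and alanine and valine only in the set for GAA, which matches the observed mutational patterns.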



Single Amino Acid Substitutions 
Usually Do Not Alter Enzyme Activity 

The ability of a polypeptide chain to be enzymatically active does not 
require an exactly specified amino acid sequence. This is shown by 
examination of the new mutant strains obtained by treating strains 
A23 and A46 with mutagens. The possession of either glycine or ser- 
ine in position 212 yields a fully active enzyme, whereas threonine in 






Figure 8-15

Formation of mutants A23 and A46 and their subsequent mutations.
Notice that Thr and Ser cannot result from a single base change to the
codon for Glu; likewise, Ala and Val cannot result from only one base
change to the codon for Arg. Therefore, the A23 and A46 mutants must
occur from mutations at two different mutable sites, as shown in
Figure 8-14.

[Diagram: the wild-type tryptophan synthetase gene, whose codon codes
for Gly, gives rise to mutant gene A23 (Arg) and mutant gene A46 (Glu).
Mutant A23 mutates further to Thr or Ser; mutant A46 mutates further
to Ala or Val.]



the same position yields an enzyme with reduced activity,
demonstrating that the activity of an enzyme does not demand a perfectly
unique amino acid sequence (Figure 8-16). In fact, evidence now
indicates that amino acid replacements in many parts of a polypeptide
chain can occur without seriously modifying catalytic activity.
However, one sequence may often be best suited to a cell's particular
needs, and it is this sequence that is encoded by the wild-type allele.
Even though other sequences are almost as good, they will tend to be
selected against in evolution.



Figure 8-16

Evidence that many amino acid replacements do not result in loss of
enzymatic activity.

[Diagram: the wild-type gene (Gly in the amino acid sequence) produces
a completely active enzyme. A mutation to loss of enzymatic activity
gives mutant gene A23. Additional mutations that restore enzymatic
activity give one gene that produces a partially active enzyme and one
gene that produces a completely active enzyme.]






A Second Amino Acid Replacement 
May Cancel Out the Effect of the First 10 

The conclusion that minor changes to amino acid sequence do not 
significantly alter enzyme activity is extended by the finding that 
some mutations that convert inactive mutant enzymes to active forms 
may work by causing a second amino acid replacement in the mutant 
enzyme. Consider mutant A46, which produces inactive tryptophan 
synthetase because of the substitution of glutamic acid for glycine at 
position 212. In this case, distant second-site mutations that result in
the active enzyme occasionally emerge. For example, the second-site 
mutation A446 is located one-tenth of a gene length away from the 
first mutation. The double mutant A46A446 produces active enzyme 
molecules containing two amino acid replacements: the original 
glycine-to-glutamic acid shift and a tyrosine-to-cysteine shift located 
36 amino acids away (Figure 8-17). 

The second shift can be studied independently of the first by
obtaining recombinant cells with only the A446 mutation. Most
interestingly, the A446 change, when present alone, also results in an
inactive enzyme. We thus see that a combination of two wrong amino acids
can produce an enzyme with an active three-dimensional configuration.
However, only occasionally do two wrong amino acids cancel
out each other's faults. For example, double mutants containing A446
and A23, or A446 and A187, do not produce active enzyme. At this
time, it does not seem wise to speculate on how the various amino 
acid residues are folded together in the three-dimensional configura- 
tion and why only some combinations are enzymatically active. This 
kind of analysis must await the establishment of the three-dimen- 
sional structure of tryptophan synthetase. 

The Very Drastic Consequences of the
Insertion or Deletion of Single Base Pairs 11, 12

Early on in the analysis of mutant proteins, it became clear that the 
vast majority of mutants being isolated did not yield the minimally 
altered proteins, bearing single amino acid replacements, that would 
arise through the change of one type of base pair into one of its three 
alternatives. Instead, most mutants represented changes that led to 
drastically altered gene products, often containing many fewer amino 
acids and with many of their amino acid sequences bearing no rela- 
tionship to the wild-type polypeptide products. The nature of these 
mutants first became apparent through the proposal that such muta- 
tions usually represented either insertions or deletions of single nu- 
cleotide pairs. The drastic effect of these insertion or deletion events 
is a consequence of the fact that mRNA molecules are read in succes- 
sive blocks of three nucleotides, called codons. AUG codons, which 
code for the methionine residues found at the amino terminal ends of 
newly synthesized polypeptide chains, are the signal for ribosomes to 
begin reading the mRNA molecule about to be translated into a
protein. Since reading always begins at the appropriate AUG codon,
the mRNA molecules are aligned on the ribosomes so that their mes- 
sages are read in the correct reading frame. 

If, however, a single base pair is inserted or deleted in a coding
sequence, the triplets that designate amino acids become completely
changed beginning at the site of insertion or deletion (Figure 8-18). 






Figure 8-17

Reversal (suppression) of mutant phenotype by a second mutation at a
second site in the same gene.

[Diagram: site of the A446 mutation (positions 174-176) and site of
the A46 mutation (positions 210-212).
Wild-type protein (codons UAU and GGA): enzymatically active.
A46 mutant protein (GAA): enzymatically inactive.
A446 mutant protein (UGU, Cys): enzymatically inactive.
A46A446 mutant protein: enzymatically active.]



For example, if normally the gene sequence ATTAGACAC . . . is
read as (ATT)(AGA)(CAC) . . . , then the insertion of a new
nucleotide C in the fourth position of that sequence creates
ATTCAGACAC . . . , which is read as (ATT)(CAG)(ACA)(C . . .). These
new triplets may code for entirely different amino acids. A similar
consequence follows from a deletion. Moreover, the crossing of two
deletion or two insertion mutants yields double mutants in which the
reading frame is still misplaced.
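
The example in the text can be reproduced with a short Python sketch. The helper `triplets` is an illustrative name; it reads a sequence in successive groups of three from its first base and ignores any incomplete group at the end.

```python
def triplets(sequence):
    """Split a sequence into successive complete groups of three bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


normal = "ATTAGACAC"

# Insert a new nucleotide C in the fourth position (index 3 when counting from 0).
mutant = normal[:3] + "C" + normal[3:]

print(normal, triplets(normal))
print(mutant, triplets(mutant))
```

Every triplet after the insertion point is different from the original one, even though only a single base was added.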

Reversion of Insertion or Deletion Mutants

Active (or partially active) genes are regenerated by crossing over 
between an insertion and a nearby deletion. Such events restore the 
correct reading frame except in the short region between the muta- 
tions (see Figure 8-18). If the affected gene region is nonessential 
(e.g., the early section of the T4 rllB gene), then the resulting protein 
product is fully functional. In other cases, the short segments of inap- 
propriate amino acids are only mildly disadvantageous, and partial 
activity results. No activity, however, will usually be found if the 
inappropriate codons include any of the three that signify chain ter- 
mination (UAA, UAG, or UGA). Their presence inevitably results in 
incomplete fragments of the wild-type polypeptide. 
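
The restoration of the reading frame can be illustrated with the schematic sequence drawn in Figure 8-18. In this Python sketch the positions of the deleted and inserted bases are read off that figure, the helper names are assumptions made for the example, and the triplets are compared as plain strings rather than translated, since the figure's sequence is schematic.

```python
def triplets(sequence):
    """Split a sequence into successive complete groups of three bases."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def changed_triplets(reference, other):
    """Return the indices of the triplets that differ between two sequences."""
    pairs = zip(triplets(reference), triplets(other))
    return [i for i, (a, b) in enumerate(pairs) if a != b]


NORMAL = "TAGCATTATTACGATATTAGG"

# Deletion mutant: the C at index 3 is lost, shifting every later triplet.
deletion_mutant = NORMAL[:3] + NORMAL[4:]

# Recombinant: a G inserted at index 8 of the deletion mutant shifts the frame back.
recombinant = deletion_mutant[:8] + "G" + deletion_mutant[8:]

print("deletion only:", changed_triplets(NORMAL, deletion_mutant))
print("recombinant:  ", changed_triplets(NORMAL, recombinant))
```

With the deletion alone, every triplet after the first is altered; in the recombinant only the two triplets lying between the deletion and the insertion differ from the normal message.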

It is also sometimes possible to obtain functional genes by produc- 
ing recombinants containing three closely spaced insertions or dele- 
tions (Figure 8-19). In contrast, recombinants containing four nearby 
insertions or deletions produce only nonfunctional polypeptides. 
These later experiments were performed in 1961, before the basic out- 
lines of the genetic code were known. They in fact provided the first 
good evidence that the genetic code was likely to be read in groups of 
three as opposed to groups of two or four. 
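
The logic of these experiments reduces to simple arithmetic on the net number of bases gained or lost. The sketch below is a deliberately simplified model with illustrative names: it asks only whether the reading frame downstream of the last change is back in register, and it ignores what the altered stretch in between does to the protein.

```python
INSERTION = +1
DELETION = -1


def frame_restored(changes, group_size=3):
    """True if the net gain or loss of bases is a whole number of groups."""
    return sum(changes) % group_size == 0


cases = {
    "one insertion": [INSERTION],
    "insertion + deletion": [INSERTION, DELETION],
    "two insertions": [INSERTION] * 2,
    "three insertions": [INSERTION] * 3,
    "three deletions": [DELETION] * 3,
    "four insertions": [INSERTION] * 4,
}

for name, changes in cases.items():
    by_size = {size: frame_restored(changes, size) for size in (2, 3, 4)}
    print(f"{name:22s} restored if read in groups of 2/3/4: {by_size}")
```

Only reading in groups of three predicts the observed pattern, in which three nearby insertions or deletions can give a functional gene while two or four cannot.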






Figure 8-18

Mutations that add or remove a base shift the reading frame of the
genetic message.

[Diagram; only one of the two complementary strands is shown. The
reading of the genetic code (i.e., selection of the correct amino
acids) always begins from one end of the template. Each amino acid is
coded by a group of three nucleotides, and each number in the
polypeptide product designates one of the twenty amino acids.
Normal genetic message (TAGCATTATTACGATATTAGG): codes for the amino
acid sequence in a functional protein.
Mutant genetic message containing an insertion of a single nucleotide
(TAGCATTATGTACGATATTAG): incorrect amino acids follow the insertion;
the polypeptide product has no biological activity.
Mutant genetic message containing a deletion of a single nucleotide:
incorrect amino acids follow the deletion; the polypeptide product has
no biological activity.
Recombinant genetic message containing both an insertion and a
deletion mutation (TAGATTATGTACGATATTAGG), formed by crossing over
between the deletion and insertion mutants: the polypeptide product
has only two incorrect amino acids and may have biological activity.]



Cloned Genes Can Be Sequenced 13-17

Virtually all the essential features of the genetic code were deduced
by 1966 from the coding properties of either enzymatically or
chemically synthesized mRNA molecules and from the accumulated
knowledge of genetic fine structure that we have just detailed. No real
genes were directly analyzed, however, since at that time there were
no procedures either to sequence DNA or to isolate desired genes.
But with the arrival of recombinant DNA and of powerful methods
for DNA sequencing, the nature of genetic research has dramatically
changed. No longer are genetic crosses the prime vehicle for probing
genes. The quickest and most direct way to proceed is now the
cloning and sequencing of relevant genetic material. As indicated in the
previous chapter, it is now a relatively straightforward matter to
isolate any E. coli gene that codes for a function that can be selected
for by one of the many enrichment procedures.






Figure 8-19

When three nucleotides are added close together, the genetic message
is scrambled only over a short region. The same type of result is
achieved by the deletion of three nearby nucleotides.

[Diagram; only one of the complementary chains is shown. Reading of
the genetic code always begins at one end of the gene, marked in the
diagram.
Normal gene (TAGCATTATTACGATATTAGGCCT; 3n nucleotides): codes for the
amino acid sequences in a functional protein of n amino acids.
Gene with three added nucleotides (3(n + 1) nucleotides): codes for
n + 1 amino acids. The polypeptide chain contains five incorrect amino
acids; its chain length is increased by one amino acid. It may have
some biological activity depending upon how the five wrong amino acids
influence its 3-D structure.]



Already, a large number of E. coli genes have been completely or 
partially sequenced. In all cases, the codons found to specify given 

amino acids are those predicted by the genetic code (Figure 8-20).
This agreement between prediction and result, though inherently
very satisfying, surprised no one, since the experimental evidence 
used to deduce the genetic code was effectively unassailable (see 
Chapter 15). Also as predicted, the coding segments of virtually all 
mRNAs start with the AUG codon and always conclude with a chain- 
terminating codon (UAA, UAG, or UGA). 

Untranslated Sequences at the
Beginnings and Ends of mRNA Molecules 18-23

When mRNA was first discovered, it seemed simplest to assume that 
the translation events would begin at one end of the molecule and 
then move along in steps of three nucleotides until the other end was 
reached. This was a very naive view, adopted before the discoveries 
that methionine initiates all polypeptide chains and that specific co- 
dons specify chain termination. Now we realize that untranslated 
sequences exist at both the 5' end of the mRNA, near which transla- 
tion begins, and at the 3' end, near which translation stops (Figure 
8-21). Hence, there must be internal signals in mRNA that mark the 
starting and stopping sites for translation. With the exception of a 
small purine-rich block of nucleotides that functions to position
ribosomes at the correct AUG start codon, the untranslated regions
probably play no role in translation and are of variable lengths, ranging
from 20 to more than 100 nucleotides, depending on the particular 
mRNA species. 
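
The idea of internal start and stop signals can be sketched as a small Python function that locates the translated region of an mRNA and reports the lengths of the untranslated sequences on either side. This is a simplified illustration under stated assumptions: the example mRNA is hypothetical, and the function simply takes the first AUG it meets as the start codon rather than modeling how ribosomes are positioned.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def coding_region(mrna):
    """Return (start, end) of the translated region, or None if there is none.

    `start` is the index of the first AUG; `end` is the index just past the
    first in-frame chain-terminating codon that follows it.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


# Hypothetical mRNA: 5' untranslated sequence, coding segment, 3' untranslated sequence.
EXAMPLE_MRNA = "AAGGAGGUAACC" + "AUGGCUUAUGGAUAA" + "GCUUCC"

region = coding_region(EXAMPLE_MRNA)
if region is not None:
    start, end = region
    print("5' untranslated length:", start)
    print("coding segment:", EXAMPLE_MRNA[start:end])
    print("3' untranslated length:", len(EXAMPLE_MRNA) - end)
```

Real untranslated regions are much longer than in this toy example, ranging from 20 to more than 100 nucleotides as noted above.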

These seemingly unnecessary extra sequences only make sense 



